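Skeleton sequences are compact and lightweight, and numerous skeleton-based action recognizers have been proposed to classify human behaviors. In this work, we aim to incorporate components that are compatible with existing models and further improve their accuracy. To this end, we design two temporal accessories: discrete cosine encoding (DCE) and chronological loss (CRL). DCE lets a model analyze motion patterns in the frequency domain while alleviating the influence of signal noise. CRL guides the network to explicitly capture the chronological order of a sequence. These two components consistently improve the accuracy of many recently proposed action recognizers, achieving new state-of-the-art (SOTA) accuracy on two large datasets.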
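As a rough illustration of the frequency-domain idea behind DCE (not the authors' implementation, whose details the abstract does not give), the sketch below applies an orthonormal DCT-II along the time axis of a joint trajectory and keeps only the lowest-frequency coefficients. The function names, the `keep` parameter, and the choice of DCT-II are assumptions made for this example.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis; row k is the k-th cosine basis function."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)
    basis[1:] *= np.sqrt(2.0 / n)
    return basis


def low_frequency_encoding(trajectory: np.ndarray, keep: int) -> np.ndarray:
    """First `keep` DCT coefficients of a (frames, channels) trajectory.

    `keep` is a hypothetical hyperparameter: smaller values discard more
    high-frequency content, which is where sensor jitter tends to live.
    """
    basis = dct_matrix(trajectory.shape[0])
    return (basis @ trajectory)[:keep]


def reconstruct(coeffs: np.ndarray, frames: int) -> np.ndarray:
    """Invert the truncated encoding; dropped coefficients count as zero."""
    basis = dct_matrix(frames)
    return basis[: coeffs.shape[0]].T @ coeffs
```

Because the basis is orthonormal, truncating coefficients acts as a simple low-pass filter: a smooth motion is mostly preserved, while white noise spread evenly across frequencies is largely discarded.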
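Skeleton-based action recognition attracts practitioners and researchers because its datasets are lightweight and compact. Compared with RGB-video-based action recognition, it is a safer way to protect the privacy of subjects while offering competitive recognition performance. However, as skeleton estimation algorithms and motion and depth sensors improve, skeleton datasets can preserve more detail of motion characteristics, which creates a potential privacy leak. To investigate this leakage, we first train classifiers to predict sensitive private information from joint trajectories. Experiments show that a model trained to classify gender reaches 88% accuracy, and a model trained for re-identification recognizes individuals with 82% accuracy. We propose two variants of an anonymization algorithm to protect skeleton datasets against this leakage. Experimental results show that the anonymized datasets reduce the risk of privacy leakage while having only a marginal effect on action recognition performance.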
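Crowdsourcing systems enable us to collect noisy labels from crowd workers. A graphical model representing the local dependencies between workers and tasks provides a principled way of inferring the true labels from the noisy answers. In many cases, however, one needs a predictive model learned directly from the crowdsourced dataset rather than from the true labels. To infer the true labels and learn a predictive model simultaneously, we propose a new data-generating process in which a neural network generates the true labels from task features. We devise an EM framework that alternates between variational inference and deep learning, to infer the true labels and to update the neural network, respectively. Experimental results on synthetic and real-world datasets show that a belief-propagation-based EM algorithm is robust to i) corruption of task features, ii) multi-modal or mismatched worker priors, and iii) a few spammers submitting noise to many tasks.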
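Skeleton sequences are lightweight and compact, and are therefore ideal candidates for action recognition on edge devices. Recent skeleton-based action recognition methods extract features from 3D joint coordinates as spatial-temporal cues and use these representations in a graph neural network to boost recognition performance. The use of first- and second-order features, i.e., joint and bone representations, has led to high accuracy. Nonetheless, many models are still confused by actions that have similar motion trajectories. To address this issue, we propose fusing higher-order features, in the form of angular encoding, into modern architectures to robustly capture the relationships between joints and body parts. This simple fusion with popular spatial-temporal graph neural networks achieves new state-of-the-art accuracy on two large benchmarks, NTU60 and NTU120, while using fewer parameters and less run time. Our source code is publicly available at: https://github.com/zhenyueqin/angular-skeleton-soding.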
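To make the notion of an angular feature concrete, here is a minimal sketch that computes, per frame, the cosine of the angle formed at one joint by two other joints. It is only an illustration under assumptions: the abstract does not specify which joint triplets or which angular encodings the authors use, and the function name and joint indices below are hypothetical.

```python
import numpy as np


def joint_angle_cosine(joints: np.ndarray, center: int, end_a: int,
                       end_b: int, eps: float = 1e-8) -> np.ndarray:
    """Cosine of the angle at `center` between the vectors to two other joints.

    joints: array of shape (frames, num_joints, 3) with 3D coordinates.
    Returns an array of shape (frames,), one cosine per frame.
    """
    u = joints[:, end_a] - joints[:, center]
    v = joints[:, end_b] - joints[:, center]
    dot = np.sum(u * v, axis=-1)
    norm = np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1)
    # eps guards against division by zero when two joints coincide.
    return dot / (norm + eps)


# Hypothetical 3-joint arm: shoulder (0), elbow (1), wrist (2), one frame.
arm = np.array([[[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]])
elbow_cos = joint_angle_cosine(arm, center=1, end_a=0, end_b=2)
```

Unlike raw joint coordinates, this quantity is unchanged when the whole skeleton is translated or uniformly scaled, which is one plausible reason angle-based features help separate actions with similar trajectories.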
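Few-shot object detection (FSOD) aims to classify and detect objects of novel categories from only a few images. Existing meta-learning methods make insufficient use of the features shared between support and query images because of structural limitations. We propose a hierarchical attention network with sequentially larger receptive fields to fully exploit the query and support images. In addition, meta-learning does not distinguish categories well, because it only determines whether the support and query images match. In other words, metric-based learning for classification is ineffective because it does not act on the classification objective directly. We therefore propose a contrastive learning method, called meta-contrastive learning, that directly helps achieve the purpose of the meta-learning strategy. Finally, we establish a new state-of-the-art network by significant margins. Our method brings improvements of 2.3, 1.0, 1.3, 3.4, and 2.4% AP for 1- to 30-shot object detection on the COCO dataset. Our code is available at: https://github.com/infinity7428/hanmcl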
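In this paper, we propose an energy-efficient SNN architecture that can seamlessly run deep spiking neural networks (SNNs) with improved accuracy. First, we propose conversion-aware training (CAT) to reduce ANN-to-SNN conversion loss without hardware implementation overhead. In the proposed CAT, the activation function used to simulate the SNN during ANN training is exploited to reduce the data representation error after conversion. Based on the CAT technique, we also present a time-to-first-spike coding that enables lightweight computation by using spike timing information. An SNN processor design that supports the proposed techniques has been implemented in a 28 nm CMOS process. The processor achieves top-1 accuracies of 91.7%, 67.9%, and 57.4% with inference energies of 486.7 uJ, 503.6 uJ, and 1426 uJ on CIFAR-10, CIFAR-100, and Tiny-ImageNet, respectively, when running VGG-16 with 5-bit logarithmic weights.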
3D-aware image synthesis focuses on preserving spatial consistency while generating high-resolution images with fine details. Recently, Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate generative NeRFs and show remarkable results, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated, photorealistic, 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated on three image datasets: AFHQ, CelebA, and Cars. Our model shows strong 3D consistency with fine details and smooth interpolation in conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF achieves a Fr\'echet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis at a $\text{128}^{2}$ resolution. Additionally, because $\text{C}^{3}$G-NeRF can synthesize class-conditional images, we report FIDs of the generated 3D-aware images for each class of the datasets.
In both terrestrial and marine ecology, physical tagging is a frequently used method to study population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis. This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, followed by a projection head, and we use contrastive learning to find similar or dissimilar image pairs in a collection of uniform photographs. We apply this technique to corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings range from a few days to several years. Our model achieves a one-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88 on our dataset.
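The abstract does not say which contrastive objective is used, so the following is only a sketch of one classic choice, a margin-based pairwise loss on embedding vectors: pairs from the same individual are pulled together, and pairs from different individuals are pushed at least `margin` apart. The function name, the `margin` value, and the use of Euclidean distance are assumptions, not details from the paper.

```python
import numpy as np


def contrastive_pair_loss(emb_a: np.ndarray, emb_b: np.ndarray,
                          same: np.ndarray, margin: float = 1.0) -> float:
    """Mean margin-based contrastive loss over a batch of embedding pairs.

    emb_a, emb_b: arrays of shape (batch, dim), e.g. projection-head outputs.
    same: array of shape (batch,), 1 if the pair shows the same individual,
          0 otherwise.
    """
    dist = np.linalg.norm(emb_a - emb_b, axis=-1)
    pull = same * dist ** 2                                  # same individual
    push = (1 - same) * np.maximum(0.0, margin - dist) ** 2  # different ones
    return float(np.mean(pull + push))
```

At inference time, a model trained this way can re-identify an individual by comparing the embedding of a new photo with the embeddings of known individuals and picking the nearest one.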
Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning (RL), but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality and outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
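The greedy policy described above can be written down exactly when the data distribution is known. The sketch below assumes such oracle access in the form of a small discrete joint probability table, which is precisely the requirement the abstract says is unrealistic in practice; it does not implement the amortized learning approach. All function names and the toy distribution are hypothetical.

```python
import numpy as np


def conditional_mutual_information(joint, feature, observed):
    """I(y; x_feature | x_S = s) in nats.

    joint: probability table with one axis per feature and a final axis for y.
    observed: dict {feature index: observed value}; assumed to have nonzero
              probability under `joint`.
    """
    index = [slice(None)] * joint.ndim
    for i, value in observed.items():
        index[i] = slice(value, value + 1)  # keep the axis, fix its value
    cond = joint[tuple(index)]
    cond = cond / cond.sum()                # condition on the observations
    y_axis = joint.ndim - 1
    other = tuple(a for a in range(joint.ndim) if a not in (feature, y_axis))
    pxy = cond.sum(axis=other)              # shape (|x_feature|, |y|)
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


def greedy_next_feature(joint, observed):
    """Pick the unobserved feature with the largest conditional MI with y."""
    candidates = [i for i in range(joint.ndim - 1) if i not in observed]
    scores = {i: conditional_mutual_information(joint, i, observed)
              for i in candidates}
    return max(scores, key=scores.get), scores


def toy_joint():
    """Toy table: y is a noisy copy of x0, x1 is a noisy copy of x0,
    and x2 is independent of everything else."""
    joint = np.zeros((2, 2, 2, 2))
    for x0 in range(2):
        for x1 in range(2):
            for x2 in range(2):
                for y in range(2):
                    joint[x0, x1, x2, y] = (
                        0.5 * (0.8 if x1 == x0 else 0.2)
                        * 0.5 * (0.9 if y == x0 else 0.1))
    return joint
```

In this toy example, `x1` looks informative before anything is observed, but once `x0` is known it adds nothing about `y`. This is why the selection criterion conditions on the features already queried rather than ranking features once up front.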
The purpose of this work was to tackle practical issues which arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot which overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input which resolves the redundancy in the kinematics and a physical interpretation of this redundancy are provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased with proximal section angle under all testing conditions, with an average error reduction of 41.48% for re-tension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing the time delay, and re-tension compensation for removing the DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.